Helpful Statistics in Recognizing Basic Arabic Phonemes

نویسندگان

  • Mohamed O.M. Khelifa
  • Mostafa Belkasmi
چکیده

The recognition of continuous speech is one of the main challenges in the building of automatic speech recognition (ASR) systems, especially when it comes to phonetically complex languages such as Arabic. An ASR system seems to be actually in a blocked alley. Nearly all solutions follow the same general model. The previous research focused on enhancing its performance by incorporating supplementary features. This paper is part of ongoing research efforts aimed at developing a high-performance Arabic speech recognition system for learning and teaching purposes. It investigates a statistical analysis of certain distinctive features of the basic Arabic phonemes which seems helpful in enhancing the performance of a baseline HMMbased ASR system. The statistics are collected using a particular Arabic speech database, which involves ten different male speakers and more than eight hours of speech which covers all Arabic phonemes. In HMM modeling framework, the statistics provided are helpful in establishing the appropriate number of HMM states for each phoneme and they can also be utilized as an initial condition for the EM estimation procedure, which generally, accelerates the estimation process and, thus, improves the performance of the system. The obtained findings are presented and possible applications of automatic speech recognition and speaker identification systems are also suggested. Keywords—automatic speech recognition (ASR); speech recognizer; phonemes recognition; speech database; hidden Markova models (HMMs)

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Sound-Imitation Word Recognition for Environmental Sounds Disambiguation in Determining Phonemes of Sound-Imitation Words

Environmental sounds are very helpful in understanding environmental situations and in telling the approach of danger, and sound-imitation words (sound-related onomatopoeia) are important expressions to inform such sounds in human communication, especially in Japanese language. In this paper, we design a method to recognize sound-imitation words (SIWs) for environmental sounds. Critical issues ...

متن کامل

Disambiguation in determining phonemes of sound-imitation words for environmental sound recognition

Onomatopoeia, or sound-imitation words (SIWs) are important in informing sound events in human-computer communication. One problem is listener-dependency in recognizing environmental sounds by means of SIWs, that is, different listener hears the same environmental sound as a different SIW even under the same condition. Therefore, the use of usual Japanese phonemes is not adequate to express SIW...

متن کامل

Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model

In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is one of the considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...

متن کامل

Automatic Arabic Speech Segmentation Syste

growth of information and communication technologies has influenced the research trends n speech technologies. This research explains a basic speech segmentation application for Arabic language with the aim to further develop a language tutor. The focus is on rabic as there are standards available which help in obtaining better accuracy. The roblem has been formulated in the form of a number of...

متن کامل

Visual Speech Analysis,Application to Arabic Phonemes

The aim of this work is to introduce a primary research on Arabic audiovisual analysis. Each language has multiple phonemes and visemes and each viseme can have multiple phonemes. The first part focuses on how to classify Arabic visemes from still images, whereas the second part shows the variation of Pitch for each viseme. We haven’t taken coarticulation of visemes in context. To evaluate the ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017